In order to improve software reliability, software defect prediction is applied during software maintenance to identify potential bugs. Traditional methods of software defect prediction mainly focus on designing static code metrics, which are fed into machine learning classifiers to predict the defect probability of the code. However, these hand-crafted metrics do not capture the syntactic structure and semantic information of programs. Such information is more significant than manual metrics and can yield a more accurate predictive model. In this paper, we propose a framework called defect prediction via attention-based recurrent neural network (DP-ARNN). More specifically, DP-ARNN first parses the abstract syntax trees (ASTs) of programs and extracts them as token vectors. It then encodes these vectors, which serve as the inputs of DP-ARNN, through dictionary mapping and word embedding. After that, it automatically learns syntactic and semantic features. Furthermore, it employs an attention mechanism to generate more significant features for accurate defect prediction. To validate our method, we choose seven open-source Java projects from Apache, using the F1-measure and the area under the curve (AUC) as evaluation criteria. The experimental results show that, on average, DP-ARNN improves the F1-measure by 14% and the AUC by 7% compared with state-of-the-art methods.
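The pipeline the abstract describes (dictionary mapping, word embedding, a recurrent pass, and attention pooling) can be sketched minimally as follows. This is an illustrative NumPy sketch under assumed shapes and randomly initialized weights, not the paper's actual implementation: the AST token names are hypothetical, a plain RNN cell stands in for the paper's recurrent network, and all parameters would be learned jointly in the real model.

```python
import numpy as np

# Hypothetical AST node-token sequence extracted from one program
# (token names are illustrative, not the paper's exact vocabulary).
tokens = ["MethodDeclaration", "ForStatement", "IfStatement", "MethodInvocation"]

# 1) Dictionary mapping: build a vocabulary and encode tokens as integer ids.
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
ids = np.array([vocab[t] for t in tokens])

# 2) Word embedding: look up a dense vector for each id
#    (randomly initialized here; trained with the network in practice).
rng = np.random.default_rng(0)
embed_dim, hidden_dim = 8, 6
E = rng.normal(size=(len(vocab), embed_dim))
x = E[ids]                      # shape: (seq_len, embed_dim)

# 3) Recurrent pass: a simple tanh RNN cell produces one hidden
#    state per token, capturing sequential (syntactic) context.
W_xh = rng.normal(size=(embed_dim, hidden_dim))
W_hh = rng.normal(size=(hidden_dim, hidden_dim))
h = np.zeros(hidden_dim)
states = []
for x_t in x:
    h = np.tanh(x_t @ W_xh + h @ W_hh)
    states.append(h)
H = np.stack(states)            # shape: (seq_len, hidden_dim)

# 4) Attention: score each hidden state, softmax-normalize the scores,
#    and pool the states into one feature vector for the defect classifier.
w_a = rng.normal(size=hidden_dim)
scores = H @ w_a
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()            # attention weights sum to 1
features = alpha @ H            # attention-weighted summary vector
```

The attention step is what lets the model weight defect-relevant tokens more heavily than a uniform average over hidden states would.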